A time series is a sequence of observations collected at regular time intervals. Unlike other kinds of data, the ordering of a time series is fundamental: past values influence future values, sometimes in complex ways.
Time series modeling makes it possible, among other things, to forecast future values and to impute missing observations. These capabilities are fundamental in sectors that depend on continuous monitoring, such as the hydro-climatic domain of the dataset used in this notebook.
| Approach | Main characteristics | Limitations |
|---|---|---|
| Statistical models (ARIMA, SARIMA, Exponential Smoothing) | Interpretable; require assumptions of stationarity and linearity | Do not capture nonlinear relationships or long-range dependencies well |
| Classical ML models (Random Forest, XGBoost) | Can capture nonlinearities given feature engineering | Do not explicitly exploit the sequential nature of the data |
| Recurrent networks (RNN, LSTM, GRU) | Model temporal dependencies; process data sequentially | Struggle with long sequences; costly to train |
| Transformers | Direct attention over the entire sequence; efficient parallelization | Require more data and computational resources |
Transformers revolutionized sequence modeling in NLP, but their attention mechanism also makes it possible to capture long-range relationships in time series, without the step-by-step sequential processing of RNNs. Their key advantages include direct attention over the whole sequence and efficient parallelization during training.
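To make the attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention (illustrative only; it is not part of the models trained later). Every output position mixes information from all positions in a single step, which is how long-range relationships become directly reachable:

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Similarity of every query with every key -> (seq_len, seq_len) scores
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax over keys: each row becomes a set of mixing weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted average of ALL value vectors
    return weights @ V

x = np.random.randn(5, 4)  # toy sequence: 5 time steps, 4 features
print(scaled_dot_product_attention(x, x, x).shape)  # (5, 4)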
In this notebook we will explore:
- The nature and preparation of time series data: how to characterize, clean, and transform multivariate time series for use in imputation and forecasting models.
- SAITS (Self-Attention-based Imputation for Time Series): how to use SAITS to impute missing values, leveraging attention mechanisms to capture complex relationships across variables and time steps.
- A Transformer-based architecture for forecasting future values: implementing and training a Transformer adapted to time series, and analyzing its ability to model long-range dependencies and its advantages over traditional models.
- Evaluation metrics and comparison with traditional models: using metrics such as RMSE and R² to assess the quality of both imputation and forecasting, and discussing the results against classical methods (e.g., interpolation, MLP, LSTM).
The goal is to develop an applied understanding of how to harness the power of Transformers in real time series problems, and how to overcome the limitations of traditional approaches.
!pip install pypots==0.11
Successfully installed ai4ts-0.0.3 benchpots-0.4 nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-nvjitlink-cu12-12.4.127 pygrinder-0.7 pypots-0.11 tsdb-0.7.1
import missingno as msno
import pandas as pd
import numpy as np
from pypots.imputation import SAITS
# Load the dataset
df = pd.read_csv('https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/dataset2_cuenca.csv')
df
| | FECHA | QH0889 | QH0894 | PRECIPITACION M0141 | PRECIPITACION M0501 | PRECIPITACION M0431 | SST | SPI M0141 | SPI M0501 | SPI M0431 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1980-01 | 54.521 | 15.784 | 119.3 | 245.5 | NaN | 27.16 | 0.37 | 0.88 | NaN |
| 1 | 1980-02 | 54.666 | 30.408 | 120.2 | 31.4 | NaN | 27.57 | 0.39 | -1.51 | NaN |
| 2 | 1980-03 | 120.460 | 37.035 | 128.8 | 161.2 | NaN | 27.85 | 0.57 | -0.06 | NaN |
| 3 | 1980-04 | 110.465 | 85.364 | 156.6 | 224.3 | NaN | 28.18 | 1.16 | 0.65 | NaN |
| 4 | 1980-05 | 92.278 | 46.099 | 75.3 | 82.8 | NaN | 28.17 | -0.56 | -0.93 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 463 | 2018-08 | 61.781 | 41.290 | NaN | NaN | NaN | 27.53 | NaN | NaN | NaN |
| 464 | 2018-09 | 61.781 | 41.290 | NaN | NaN | NaN | 27.61 | NaN | NaN | NaN |
| 465 | 2018-10 | 61.781 | 41.290 | NaN | NaN | NaN | 27.73 | NaN | NaN | NaN |
| 466 | 2018-11 | 61.781 | 41.290 | NaN | NaN | NaN | 27.83 | NaN | NaN | NaN |
| 467 | 2018-12 | 61.781 | 41.290 | NaN | NaN | NaN | 27.86 | NaN | NaN | NaN |
468 rows × 10 columns
The figure below shows a missing-value matrix, commonly generated with libraries such as missingno in Python. This kind of visualization is very useful for understanding the structure and pattern of missing data in a multivariate time series dataset.

Each row represents a time step (for example, a specific date), and each column represents a variable (such as a weather station or a climate index).

- Complete variables: FECHA, QH0889, and SST have no missing values (fully black columns). This is convenient, especially for FECHA, which is typically used as the time index.
- Partially incomplete variables: QH0894, PRECIPITACION M0141, SPI M0501, and others show blocks of missing data, some of them intermittent.
- Shared patterns: several variables are missing over the same periods (e.g., SPI M0141 and SPI M0431), which points to structural relationships in how the data were collected.
- Distribution over time: the amount and pattern of missingness will shape the choice of imputation method.

We will use SAITS to impute these missing values efficiently and robustly, preserving the temporal structure and the correlations between variables.
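As a quick numeric complement to the matrix plot below, the per-column missing rate can be computed directly with pandas (a minimal sketch using the df loaded above):

# Fraction of missing values per column, from most to least incomplete
print(df.isna().mean().sort_values(ascending=False))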
msno.matrix(df)
<Axes: >
Before applying forecasting models, it is crucial to impute missing values in the series, since many algorithms require complete inputs to work correctly. In this notebook we begin with SAITS (Self-Attention-based Imputation for Time Series), a Transformer-based architecture designed specifically for multivariate time series imputation.
SAITS is a deep neural network that uses self-attention mechanisms to learn temporal patterns and cross-variable dependencies. It was proposed to address the common challenges of time series imputation.
SAITS consists of three main blocks:
### Encoder with bidirectional attention
### Imputation decoder
### Reconstruction
In this notebook, we apply SAITS as a first step to impute the missing values and obtain a complete dataset.
Once the data have been imputed with SAITS, we will move on to building a forecasting model on top of the completed series.
# Data preprocessing. Tedious, but PyPOTS can help.
from sklearn.preprocessing import StandardScaler
from pygrinder import mcar
from pypots.imputation import SAITS
from pypots.utils.metrics import calc_mae
# Convert 'FECHA' to datetime and set as index
df = pd.read_csv('https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/dataset2_cuenca.csv')
df['FECHA'] = pd.to_datetime(df['FECHA'], format='%Y-%m')
df.set_index('FECHA', inplace=True)
# Parameters
seq_len = 36  # window length in months
n_features = len(df.columns)  # number of variables
# Convert the data to a NumPy array
data = df.to_numpy(dtype=np.float32)
# Create sliding windows (to capture temporal context)
n_samples = len(data) - seq_len + 1
X = np.array([data[i:i + seq_len] for i in range(n_samples)])
X.shape
(433, 36, 9)
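Because consecutive windows start one time step apart, each window should overlap the next one in all but one row. A quick sanity check of the windowing (a sketch; equal_nan=True is needed because the raw data still contain NaNs):

assert np.allclose(X[0, 1:], X[1, :-1], equal_nan=True)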
# Compute the number of times each row should be repeated
repeat_factor = data.shape[0] // X.shape[0] # repetitions
extra_rows = data.shape[0] % X.shape[0] # additional rows needed
# Repeat all rows equally
expanded_arr = np.repeat(X, repeat_factor, axis=0)
# Add extra rows from the beginning to match 468
expanded_arr = np.vstack([expanded_arr, X[:extra_rows]])
expanded_arr.shape
(468, 36, 9)
# Make sure the first time step of the padded extra windows carries the final
# rows of the original data, so every original timestamp appears at step 0
expanded_arr[-extra_rows:, 0, :] = data[-extra_rows:]
expanded_arr.shape
(468, 36, 9)
from sklearn.preprocessing import MinMaxScaler
# Initialize a scaler shared across all features
scaler = MinMaxScaler()
X = expanded_arr.copy()
# Reshape so scaling is applied per feature (collapse sample and time dimensions)
X_reshaped = X.reshape(-1, X.shape[-1])  # shape: (total_rows, n_features)
# Fit and transform the data
X_scaled = scaler.fit_transform(X_reshaped)
# Restore the original shape (n_samples, seq_len, n_features)
X_scaled = X_scaled.reshape(X.shape)
print("Scaled X shape:", X_scaled.shape)
Scaled X shape: (468, 36, 9)
# Initialize SAITS model
saits = SAITS(n_steps=seq_len, n_features=n_features,
n_layers=2, d_model=256, d_ffn=128,
n_heads=4, d_k=64, d_v=64, dropout=0.1, epochs=100)
dataset = {"X": X_scaled}
# Model training. This is PyPOTS showtime.
# The whole dataset is used for training because the ground truth is never
# visible to the model; you could also split it into train/val/test sets.
saits.fit(dataset) # train the model on the dataset
imputation = saits.impute(dataset) # impute the originally-missing values
2025-06-27 23:09:51 [INFO]: No given device, using default device: cpu
2025-06-27 23:09:51 [WARNING]: ‼️ saving_path not given. Model files and tensorboard file will not be saved.
2025-06-27 23:09:51 [INFO]: SAITS initialized with the given hyperparameters, the number of trainable parameters: 1,331,210
2025-06-27 23:09:56 [INFO]: Epoch 001 - training loss (MSE): 0.2470
2025-06-27 23:09:59 [INFO]: Epoch 002 - training loss (MSE): 0.1160
...
2025-06-27 23:15:59 [INFO]: Epoch 099 - training loss (MSE): 0.0141
2025-06-27 23:16:04 [INFO]: Epoch 100 - training loss (MSE): 0.0141
2025-06-27 23:16:04 [INFO]: Finished training. The best model is from epoch#99.
imputation.shape
(468, 36, 9)
# Reshape so the scaler can invert the normalization per feature
imputation_reshaped = imputation.reshape(-1, imputation.shape[-1])  # (n_samples * seq_len, n_features)
# Invert the normalization
imputation_denorm = scaler.inverse_transform(imputation_reshaped)
# Restore the original shape
imputation_denorm = imputation_denorm.reshape(imputation.shape)
imputation_denorm.shape
(468, 36, 9)
# Extract the imputed data: keep the first time step of each window (one row per original timestamp)
imputed_values = imputation_denorm[:, 0, :]
imputed_values.shape
(468, 9)
data_imputed = pd.DataFrame(imputed_values, columns=df.columns, index=df.index[:imputed_values.shape[0]])
data_imputed
| FECHA | QH0889 | QH0894 | PRECIPITACION M0141 | PRECIPITACION M0501 | PRECIPITACION M0431 | SST | SPI M0141 | SPI M0501 | SPI M0431 |
|---|---|---|---|---|---|---|---|---|---|
| 1980-01-01 | 54.521000 | 15.784000 | 119.300011 | 245.500000 | 116.374535 | 27.160000 | 0.370000 | 0.880000 | -0.003956 |
| 1980-02-01 | 54.666000 | 30.408001 | 120.200005 | 31.400000 | 104.869133 | 27.570000 | 0.390000 | -1.510000 | -0.196275 |
| 1980-03-01 | 120.459999 | 37.035000 | 128.800003 | 161.199997 | 143.007751 | 27.850000 | 0.570000 | -0.060000 | 0.423040 |
| 1980-04-01 | 110.464996 | 85.363998 | 156.600006 | 224.300003 | 171.080978 | 28.180000 | 1.160000 | 0.650000 | 0.908974 |
| 1980-05-01 | 92.278000 | 46.098999 | 75.300003 | 82.800003 | 126.155746 | 28.170000 | -0.560000 | -0.930000 | 0.167577 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2018-08-01 | 61.780998 | 41.290001 | 113.868813 | 155.213623 | 183.009415 | 27.530001 | 0.297547 | -0.168570 | 0.509420 |
| 2018-09-01 | 61.780998 | 41.290001 | 118.099625 | 168.537689 | 178.279190 | 27.610001 | 0.386337 | -0.036009 | 0.467775 |
| 2018-10-01 | 61.780998 | 41.290001 | 119.560310 | 162.745377 | 180.670410 | 27.730000 | 0.408620 | -0.086619 | 0.464481 |
| 2018-11-01 | 61.780998 | 41.290001 | 118.191536 | 166.688354 | 181.989395 | 27.830000 | 0.374286 | -0.053477 | 0.427797 |
| 2018-12-01 | 61.780998 | 41.290001 | 118.593040 | 162.047409 | 169.233963 | 27.860001 | 0.375263 | -0.113561 | 0.263998 |
468 rows × 9 columns
msno.matrix(data_imputed)
<Axes: >
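A direct numeric confirmation of what the matrix shows, i.e., that no missing values remain after imputation (a one-line sketch):

print(data_imputed.isna().sum().sum())  # expected: 0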
The figure below shows a set of line charts comparing the original series (blue) with the values imputed by SAITS (dashed orange). The imputed series, variable by variable:

- QH0889
- QH0894
- PRECIPITACION M0141
- PRECIPITACION M0501
- PRECIPITACION M0431
- SST (sea surface temperature)
- SPI M0141, M0501, and M0431 (standardized precipitation indices)
import seaborn as sns
sns.set_context('paper')
import matplotlib.pyplot as plt
plt.figure(figsize=(15, 40))
for i, col in enumerate(df.columns):
    plt.subplot(9, 1, i + 1)
    plt.plot(df.index, df[col], label="Original")
    plt.plot(data_imputed.index, data_imputed[col], ':', label="Imputed")
    plt.title(col)
    plt.legend()
data_imputed.to_csv('dataset3_imputed.csv')
import pandas as pd
# Read the CSV file with the specified date format and dayfirst=True
data_imputed = pd.read_csv('https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/viz_data/ts_data_precipitation.csv',
parse_dates=[0], dayfirst=True, index_col=0)
# Display the first few rows of the DataFrame to verify
data_imputed.head()
| FECHA | QH0889 | QH0894 | PM0141 | PM0501 | PM0431 | SST | SPI M0141 | SPI M0501 | SPI M0431 |
|---|---|---|---|---|---|---|---|---|---|
| 1980-01-01 | 54.52 | 15.78 | 119.3 | 245.5 | 105.94 | 27.16 | 0.37 | 0.88 | -0.30 |
| 1980-02-01 | 54.67 | 30.41 | 120.2 | 31.4 | 92.91 | 27.57 | 0.39 | -1.51 | -0.51 |
| 1980-03-01 | 120.46 | 37.04 | 128.8 | 161.2 | 118.66 | 27.85 | 0.57 | -0.06 | -0.06 |
| 1980-04-01 | 110.47 | 85.36 | 156.6 | 224.3 | 138.17 | 28.18 | 1.16 | 0.65 | 0.25 |
| 1980-05-01 | 92.28 | 46.10 | 75.3 | 82.8 | 103.27 | 28.17 | -0.56 | -0.93 | -0.28 |
# Convert 'FECHA' to datetime and set as index
df = pd.read_csv('https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/dataset2_cuenca.csv')
df['FECHA'] = pd.to_datetime(df['FECHA'], format='%Y-%m')
df.set_index('FECHA', inplace=True)
df.head()
| FECHA | QH0889 | QH0894 | PRECIPITACION M0141 | PRECIPITACION M0501 | PRECIPITACION M0431 | SST | SPI M0141 | SPI M0501 | SPI M0431 |
|---|---|---|---|---|---|---|---|---|---|
| 1980-01-01 | 54.521 | 15.784 | 119.3 | 245.5 | NaN | 27.16 | 0.37 | 0.88 | NaN |
| 1980-02-01 | 54.666 | 30.408 | 120.2 | 31.4 | NaN | 27.57 | 0.39 | -1.51 | NaN |
| 1980-03-01 | 120.460 | 37.035 | 128.8 | 161.2 | NaN | 27.85 | 0.57 | -0.06 | NaN |
| 1980-04-01 | 110.465 | 85.364 | 156.6 | 224.3 | NaN | 28.18 | 1.16 | 0.65 | NaN |
| 1980-05-01 | 92.278 | 46.099 | 75.3 | 82.8 | NaN | 28.17 | -0.56 | -0.93 | NaN |
# Rename the first 9 columns of df using the columns of data_imputed
if len(data_imputed.columns) >= 9:
    df.columns = list(data_imputed.columns[:9]) + list(df.columns[9:])
else:
    print("Error: 'data_imputed' does not have at least 9 columns.")
df.head()
| FECHA | QH0889 | QH0894 | PM0141 | PM0501 | PM0431 | SST | SPI M0141 | SPI M0501 | SPI M0431 |
|---|---|---|---|---|---|---|---|---|---|
| 1980-01-01 | 54.521 | 15.784 | 119.3 | 245.5 | NaN | 27.16 | 0.37 | 0.88 | NaN |
| 1980-02-01 | 54.666 | 30.408 | 120.2 | 31.4 | NaN | 27.57 | 0.39 | -1.51 | NaN |
| 1980-03-01 | 120.460 | 37.035 | 128.8 | 161.2 | NaN | 27.85 | 0.57 | -0.06 | NaN |
| 1980-04-01 | 110.465 | 85.364 | 156.6 | 224.3 | NaN | 28.18 | 1.16 | 0.65 | NaN |
| 1980-05-01 | 92.278 | 46.099 | 75.3 | 82.8 | NaN | 28.17 | -0.56 | -0.93 | NaN |
import seaborn as sns
sns.set_context('paper')
import matplotlib.pyplot as plt
plt.figure(figsize=(24, 12))
for i, col in enumerate(df.columns[:6]):
    plt.subplot(3, 2, i + 1)
    plt.plot(df.index, df[col], label="Original", linewidth=1.75)
    plt.plot(data_imputed.index, data_imputed[col], ':', label="Imputed", linewidth=1.75)
    plt.title(col)
    plt.legend()
import pandas as pd
# Read the CSV file with the specified date format and dayfirst=True
data_imputed = pd.read_csv('https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/viz_data/ts_data_precipitation.csv',
parse_dates=[0], dayfirst=True, index_col=0)
# Keep only the first five rows and columns for a quick look
data = data_imputed.head().iloc[:, :5]
data.head()
| FECHA | QH0889 | QH0894 | PM0141 | PM0501 | PM0431 |
|---|---|---|---|---|---|
| 1980-01-01 | 54.52 | 15.78 | 119.3 | 245.5 | 105.94 |
| 1980-02-01 | 54.67 | 30.41 | 120.2 | 31.4 | 92.91 |
| 1980-03-01 | 120.46 | 37.04 | 128.8 | 161.2 | 118.66 |
| 1980-04-01 | 110.47 | 85.36 | 156.6 | 224.3 | 138.17 |
| 1980-05-01 | 92.28 | 46.10 | 75.3 | 82.8 | 103.27 |
!pip install tensorflow
Successfully installed astunparse-1.6.3 flatbuffers-25.2.10 google-pasta-0.2.0 libclang-18.1.1 tensorboard-2.19.0 tensorboard-data-server-0.7.2 tensorflow-2.19.0 tensorflow-io-gcs-filesystem-0.37.1 werkzeug-3.1.3 wheel-0.45.1
As a first approach, a baseline model is trained using a multilayer perceptron (MLP) implemented with sklearn.neural_network.MLPRegressor. The model tries to predict a target variable (here, PM0431) from the climate variables observed at the same time step.

Both the feature matrix X and the target y are standardized to improve training stability. early_stopping with a validation fraction of 20% is used to prevent overfitting. The network has two hidden layers:
MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=1000, ...)
Training is performed on the scaled data, and the predictions are mapped back to the original scale of the target with inverse_transform.
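As a minimal illustration of that scale/inverse-transform round trip (toy values, independent of the dataset):

from sklearn.preprocessing import StandardScaler
import numpy as np

s = StandardScaler()
y_toy = np.array([[1.0], [2.0], [3.0]])
y_scaled = s.fit_transform(y_toy)      # zero mean, unit variance
print(s.inverse_transform(y_scaled))   # recovers [[1.], [2.], [3.]]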
The plot below shows the model's performance on the test set. Despite being a static model, it follows the trend and amplitude of the series reasonably well over many stretches.
This static MLP serves as a baseline against which sequence-aware approaches (such as LSTMs or Transformers) can be compared. Despite its theoretical limitations, its practical performance suggests that the contemporaneous explanatory variables carry significant information for predicting the target at the same time step.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
# Load data
data = pd.read_csv(
'https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/viz_data/ts_data_precipitation.csv',
parse_dates=[0], dayfirst=True, index_col=0)
data = data.iloc[:, :5]
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
# Time-based split
split_index = int(len(data) * 0.7)
X_train, X_test = X.iloc[:split_index], X.iloc[split_index:]
y_train, y_test = y.iloc[:split_index], y.iloc[split_index:]
# Scale features and target
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X_train_scaled = scaler_X.fit_transform(X_train)
X_test_scaled = scaler_X.transform(X_test)
y_train_scaled = scaler_y.fit_transform(y_train.values.reshape(-1, 1)).flatten()
y_test_scaled = scaler_y.transform(y_test.values.reshape(-1, 1)).flatten()
# Define and fit MLP
mlp = MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=1000, random_state=42, early_stopping=True, validation_fraction=0.2)
mlp.fit(X_train_scaled, y_train_scaled)
# Predict
y_pred_scaled = mlp.predict(X_test_scaled)
y_pred = scaler_y.inverse_transform(y_pred_scaled.reshape(-1, 1)).flatten()
assert len(y_test) == len(y_pred), "Test and prediction length mismatch"
# Evaluate
mse = mean_squared_error(y_test, y_pred)
print(f"MSE: {mse:.2f}")
print(f"RMSE: {np.sqrt(mse):.2f}")
print(f"R²: {r2_score(y_test, y_pred):.2f}")
print(f"MAE: {mean_absolute_error(y_test, y_pred):.2f}")
# Plot
plt.figure(figsize=(20, 5))
plt.plot(y_train.index, y_train, label="Train", color='blue')
plt.plot(y_test.index, y_test, label="Test - True", color='green')
plt.plot(y_test.index, y_pred, label="Test - Predicted", color='red', linestyle='--')
plt.title("Time Series - True and Predicted Values (PM0431)")
plt.ylabel("Target Variable (PM0431)")
plt.xlabel("Date")
plt.legend()
plt.tight_layout()
plt.show()
MSE: 1743.48 RMSE: 41.75 R²: 0.29 MAE: 28.77
In this second stage, we implement an MLP that incorporates temporal information explicitly. Past values (lags) of all variables are added as new inputs, turning the model into a form of sliding-window prediction. Although the MLP is not a sequential model itself, this approach lets it learn short-term temporal relationships indirectly.
| Characteristic | Static model | Model with lags |
|---|---|---|
| Input | Variables at time t | Variables at t, t−1, t−2, t−3 |
| Captures temporal dynamics? | ❌ No | ✅ Yes (through lags) |
| Model type | Plain MLP | Plain MLP |
| Temporal dependence | Not explicit | Implicit through features |
MLPRegressor(hidden_layer_sizes=(100,), max_iter=500)
for col in df.columns:
    for lag in [1, 2, 3]:
        df_lagged[f'{col}_lag{lag}'] = df[col].shift(lag)
This technique enriches the feature set without resorting to sequential architectures such as LSTMs or Transformers.
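A toy illustration of the lagging scheme (hypothetical values): shift(k) aligns each row with the value observed k steps earlier, and the first rows, which have no complete history, become NaN and are dropped.

import pandas as pd

toy = pd.DataFrame({'x': [10, 20, 30, 40, 50]})
toy['x_lag1'] = toy['x'].shift(1)
toy['x_lag2'] = toy['x'].shift(2)
print(toy.dropna())  # each remaining row carries its two previous values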

The model now tries to predict the future value of the target using only past information, which is a harder task but also a more realistic assessment of predictive performance.
RMSE (root mean squared error) and R² (coefficient of determination) are used as the main metrics. This MLP-with-lag-features approach is an effective way to introduce temporal information into traditional dense models. It is a good starting point before moving on to more sophisticated sequential models such as LSTMs or Transformers, and it helps assess whether the recent history of the variables contains enough information to anticipate their evolution.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
# --- Load Data ---
data = pd.read_csv(
'https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/viz_data/ts_data_precipitation.csv',
parse_dates=[0], dayfirst=True, index_col=0
)
data = data.iloc[:, :5] # Use only 5 columns
target_col = 'PM0431' # Last column as target
# --- Create Lag Features (time modeling) ---
def create_lagged_features_all(df, lags=[1, 2, 3]):
    df_lagged = df.copy()
    for col in df.columns:
        for lag in lags:
            df_lagged[f'{col}_lag{lag}'] = df[col].shift(lag)
    df_lagged.dropna(inplace=True)
    return df_lagged
lags = [1, 2, 3]
data_lagged = create_lagged_features_all(data, lags)
# --- Train/Test Split ---
split_index = int(len(data_lagged) * 0.7)
train = data_lagged.iloc[:split_index]
test = data_lagged.iloc[split_index:]
X_train = train.drop(columns=[target_col])
y_train = train[target_col]
X_test = test.drop(columns=[target_col])
y_test = test[target_col]
# --- Scale Features ---
scaler_X = StandardScaler()
X_train_scaled = scaler_X.fit_transform(X_train)
X_test_scaled = scaler_X.transform(X_test)
# --- Train MLP Model ---
mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=500, random_state=42)
mlp.fit(X_train_scaled, y_train)
# --- Predict ---
y_pred = mlp.predict(X_test_scaled)
#y_pred = np.clip(y_pred, 0, None) # avoid negatives
# --- Plot Results ---
plt.figure(figsize=(20, 5))
plt.plot(y_train.index, y_train, label='Train', color='blue')
plt.plot(y_test.index, y_test, label='True', color='green')
plt.plot(y_test.index, y_pred, label='Predicted', color='red', linestyle='--')
plt.title('MLP Forecast with Lag Features '+target_col)
plt.xlabel('Date')
plt.ylabel(target_col)
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
# --- Evaluation ---
mse = mean_squared_error(y_test, y_pred)
print(f"MSE: {mse:.2f}")
print(f"RMSE: {np.sqrt(mse):.2f}")
print(f"R²: {r2_score(y_test, y_pred):.2f}")
print(f"MAE: {mean_absolute_error(y_test, y_pred):.2f}")
/usr/local/lib/python3.11/dist-packages/sklearn/neural_network/_multilayer_perceptron.py:691: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet. warnings.warn(
MSE: 2638.68 RMSE: 51.37 R²: -0.07 MAE: 41.18
The MLP configured with lag features was trained to predict the precipitation variable PM0431, using a 3-step look-back window for each predictor variable. The training set covers the first 70% of the series and the remaining 30% is used for testing.
| Metric | Value |
|---|---|
| MSE | 2638.68 |
| RMSE | 51.37 |
| MAE | 41.18 |
| R² | −0.07 |

Overall, the model captures the general range and seasonality, but it has serious difficulty following extreme values (especially the high-precipitation peaks after 2010). It also tends to smooth out the oscillations, which indicates limited responsiveness to abrupt events.

Low overall performance (R² = −0.07). A negative $R^2$ means the model does not improve on a trivial prediction (for example, always predicting the mean of the target): since $R^2 = 1 - \sum_i (y_i - \hat{y}_i)^2 / \sum_i (y_i - \bar{y})^2$, predicting the mean yields $R^2 = 0$, and anything below that explains none of the variance. This is a clear sign of underfitting: the model is not capturing the complexity of the signal.
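That trivial baseline can be scored directly (a sketch reusing y_train and y_test from the cell above): predicting the training mean for every test point gives an R² at or below zero, which is the bar this MLP fails to clear.

import numpy as np
from sklearn.metrics import r2_score

y_baseline = np.full(len(y_test), y_train.mean())
print(f"Mean-baseline R²: {r2_score(y_test, y_baseline):.2f}")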
High RMSE and MAE relative to the scale of the series. An RMSE of 51.37 implies substantial errors, given that most observed PM0431 values lie between 0 and 300; the MAE of 41.18 shows that even the mean absolute errors are large.

Damped predictions. The model tends to underestimate peaks and overestimate troughs, a typical behavior of dense models such as MLPs when the input features carry limited predictive signal or the network lacks the capacity to track abrupt changes.

Although this MLP with lags is a step toward incorporating temporal information, its performance on PM0431 suggests that more powerful approaches are needed to capture the complexity of the phenomenon. It serves as a comparative baseline, but it makes clear that more specialized time series models are required.
The model implemented here is a simplified Transformer adapted to forecasting a single (univariate) target. Unlike the original Transformers for natural language, it is designed to work on fixed windows of multivariate data and to produce a single future prediction.
Input (shape: n_steps × n_features)
└── Dense layer (proyección al espacio d_model)
└── Positional Encoding (seno y coseno)
└── MultiHead Attention (self-attention)
└── Layer Normalization
└── GlobalAveragePooling1D
└── Dense (ReLU)
└── Dense (output: 1)
| Component | Description |
|---|---|
| Input Layer | Receives a temporal window of shape (n_steps, n_features) |
| Dense (projection) | Projects the inputs into the d_model=32 dimensional space |
| Positional Encoding | Adds positional information using sine/cosine functions |
| Multi-Head Attention | Self-attention across the elements of the sequence |
| LayerNormalization | Stabilizes the output of the attention layer |
| GlobalAveragePooling1D | Collapses the sequence into a single aggregated representation |
| Dense (ReLU) | Intermediate hidden layer with dff=64 units and nonlinear activation |
| Dense (output) | Final layer producing a single prediction (scalar regression) |
- Model dimension (d_model): 32
- Attention heads (num_heads): 2
- Feed-forward units (dff): 64
- Test set: the final 30% of the series; predictions are visualized on scaled data.

RMSE: 52.13
R²: -0.10
This architecture shows how Transformers can be adapted to time series even in simple settings. However, their effectiveness depends heavily on data volume, window design, and fine-tuning. Although the structure can capture complex temporal dependencies, in this particular case it did not outperform simpler models such as the MLP with lags, likely due to the limited amount of data and the shallowness of the model.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, r2_score
import tensorflow as tf
from tensorflow.keras import layers
# Load data
data = pd.read_csv(
'https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/viz_data/ts_data_precipitation.csv',
parse_dates=[0], dayfirst=True, index_col=0
).iloc[:, :5]
# Normalize all data
scaler = MinMaxScaler()
data_scaled = scaler.fit_transform(data)
# Sequence preparation
def create_sequences(data, n_steps=3):
    X, y = [], []
    for i in range(n_steps, len(data)):
        X.append(data[i - n_steps:i, :-1])  # features
        y.append(data[i, -1])               # target
    return np.array(X), np.array(y)
n_steps = 3
X_all, y_all = create_sequences(data_scaled, n_steps)
# Split
split_index = int(len(X_all) * 0.7)
X_train, X_test = X_all[:split_index], X_all[split_index:]
y_train, y_test = y_all[:split_index], y_all[split_index:]
# --- Positional Encoding Layer ---
class PositionalEncoding(layers.Layer):
    def __init__(self, sequence_length, d_model):
        super(PositionalEncoding, self).__init__()
        self.pos_encoding = self.positional_encoding(sequence_length, d_model)

    def get_angles(self, pos, i, d_model):
        angle_rates = 1 / tf.pow(10000., (2 * (i // 2)) / tf.cast(d_model, tf.float32))
        return pos * angle_rates

    def positional_encoding(self, sequence_length, d_model):
        angle_rads = self.get_angles(
            pos=tf.range(sequence_length, dtype=tf.float32)[:, tf.newaxis],
            i=tf.range(d_model, dtype=tf.float32)[tf.newaxis, :],
            d_model=d_model
        )
        sines = tf.math.sin(angle_rads[:, 0::2])
        cosines = tf.math.cos(angle_rads[:, 1::2])
        pos_encoding = tf.concat([sines, cosines], axis=-1)
        return pos_encoding[tf.newaxis, ...]

    def call(self, x):
        return x + self.pos_encoding[:, :tf.shape(x)[1], :]
# Define model
def build_transformer(n_steps, n_features, d_model=32, num_heads=2, dff=64):
inputs = tf.keras.Input(shape=(n_steps, n_features))
x = layers.Dense(d_model)(inputs)
x = PositionalEncoding(n_steps, d_model)(x)
x = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)(x, x)
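    # Note: a canonical Transformer block would wrap the attention in a residual
    # connection, e.g. LayerNormalization()(x + attention_output); this
    # simplified variant applies LayerNorm directly to the attention output.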
x = layers.LayerNormalization()(x)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(dff, activation='relu')(x)
outputs = layers.Dense(1)(x)
return tf.keras.Model(inputs=inputs, outputs=outputs)
model = build_transformer(n_steps, X_train.shape[2])
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=200, verbose=0)
# Predict
y_pred = model.predict(X_test).flatten()
# Inverse scaling of target: copy the fitted min_/scale_ of the last column
# (the target) into a fresh scaler so that only that column is inverse-transformed
target_scaler = MinMaxScaler()
target_scaler.min_, target_scaler.scale_ = scaler.min_[-1], scaler.scale_[-1]
y_pred_inv = target_scaler.inverse_transform(y_pred.reshape(-1, 1)).flatten()
y_test_inv = target_scaler.inverse_transform(y_test.reshape(-1, 1)).flatten()
# Prepare date indices
dates = data.index[n_steps:]
train_dates = dates[:split_index]
test_dates = dates[split_index:]
# Plot
plt.figure(figsize=(20, 5))
plt.plot(train_dates, y_train, label="Train (scaled)", color='blue')
plt.plot(test_dates, y_test, label="Test - True (scaled)", color='green')
plt.plot(test_dates, y_pred, label="Test - Predicted (scaled)", color='red', linestyle='--')
plt.title("Transformer with Positional Encoding - Scaled Prediction")
plt.xlabel("Date")
plt.ylabel("Target (Scaled)")
plt.legend()
plt.tight_layout()
plt.show()
# Evaluation (mean_squared_error returns the MSE; take its square root for the RMSE)
mse = mean_squared_error(y_test_inv, y_pred_inv)
rmse = np.sqrt(mse)
r2 = r2_score(y_test_inv, y_pred_inv)
print(f"MSE: {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"R²: {r2:.2f}")
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 29ms/step
MSE: 2718.76
RMSE: 52.13
R²: -0.10
A Transformer model was implemented to predict the variable PM0431 (precipitation), using temporal windows as the input sequence. The model incorporates positional encoding, which lets it account for the order of the observations within each window, something crucial for sequential data.
| Metric | Value |
|---|---|
| MSE | 2718.76 |
| RMSE | 52.13 |
| R² | −0.10 |

Low performance (R² < 0). The negative coefficient of determination, $R^2 = -0.10$, indicates that in this case the Transformer fails to beat a trivial model that always predicts the mean of the target; there is no real gain in explained variability over that simple baseline (see the definition below).

Moderately high squared error. The RMSE of 52.13 indicates errors of the same order of magnitude as the MLP with lags. The Transformer did not improve meaningfully on that performance, which may be due to several technical factors (see below).

Smoothed, mean-reverting predictions. Visually, the model partially follows the trend of the target but struggles to capture sharp peaks and extreme values. This behavior is common when the training data is scarce, the model is under-tuned, or pooling averages out short-term variability.
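For reference, the coefficient of determination is

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},$$

so $R^2 < 0$ means the model's squared prediction error exceeds that of a constant predictor equal to the mean $\bar{y}$.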
Although the Transformer is an advanced architecture capable of modeling long-range dependencies, in this specific scenario it did not outperform the MLP with lags. This may be due to the limited dataset size, suboptimal hyperparameters, or the high variability of the series. Even so, it serves as a useful starting point for exploring more sophisticated or better-tuned models.
# 3. Load and prepare data
data = pd.read_csv(
'https://raw.githubusercontent.com/marsgr6/r-scripts/refs/heads/master/data/viz_data/ts_data_precipitation.csv',
parse_dates=['FECHA'], dayfirst=True, index_col='FECHA'
)
data[data.columns[:-5]]
| FECHA | QH0889 | QH0894 | PM0141 | PM0501 | |
|---|---|---|---|---|---|
| 0 | 1980-01-01 | 54.52 | 15.78 | 119.30 | 245.50 |
| 1 | 1980-02-01 | 54.67 | 30.41 | 120.20 | 31.40 |
| 2 | 1980-03-01 | 120.46 | 37.04 | 128.80 | 161.20 |
| 3 | 1980-04-01 | 110.47 | 85.36 | 156.60 | 224.30 |
| 4 | 1980-05-01 | 92.28 | 46.10 | 75.30 | 82.80 |
| ... | ... | ... | ... | ... | ... |
| 463 | 2018-08-01 | 61.78 | 41.29 | 96.42 | 134.98 |
| 464 | 2018-09-01 | 61.78 | 41.29 | 96.79 | 139.45 |
| 465 | 2018-10-01 | 61.78 | 41.29 | 96.10 | 131.59 |
| 466 | 2018-11-01 | 61.78 | 41.29 | 96.70 | 138.36 |
| 467 | 2018-12-01 | 61.78 | 41.29 | 99.03 | 137.36 |
468 rows × 5 columns
data.head(20)
| QH0889 | QH0894 | PM0141 | PM0501 | PM0431 | |
|---|---|---|---|---|---|
| FECHA | |||||
| 1980-01-01 | 54.52 | 15.78 | 119.3 | 245.5 | 105.94 |
| 1980-02-01 | 54.67 | 30.41 | 120.2 | 31.4 | 92.91 |
| 1980-03-01 | 120.46 | 37.04 | 128.8 | 161.2 | 118.66 |
| 1980-04-01 | 110.47 | 85.36 | 156.6 | 224.3 | 138.17 |
| 1980-05-01 | 92.28 | 46.10 | 75.3 | 82.8 | 103.27 |
| 1980-06-01 | 153.65 | 63.71 | 91.0 | 178.7 | 123.51 |
| 1980-07-01 | 141.59 | 80.77 | 116.0 | 154.0 | 128.45 |
| 1980-08-01 | 65.98 | 39.99 | 69.0 | 168.3 | 97.58 |
| 1980-09-01 | 74.51 | 38.70 | 46.4 | 189.5 | 88.05 |
| 1980-10-01 | 78.25 | 68.73 | 157.9 | 345.7 | 127.23 |
| 1980-11-01 | 57.88 | 46.31 | 172.9 | 103.8 | 118.84 |
| 1980-12-01 | 56.10 | 39.16 | 111.3 | 19.9 | 99.12 |
| 1981-01-01 | 41.52 | 27.34 | 54.8 | 29.9 | 77.85 |
| 1981-02-01 | 82.59 | 52.75 | 168.0 | 141.2 | 128.54 |
| 1981-03-01 | 79.13 | 61.74 | 250.1 | 97.4 | 143.81 |
| 1981-04-01 | 91.65 | 57.94 | 145.4 | 130.0 | 133.96 |
| 1981-05-01 | 54.31 | 51.65 | 114.5 | 140.1 | 123.01 |
| 1981-06-01 | 106.14 | 48.13 | 88.3 | 100.8 | 121.69 |
| 1981-07-01 | 90.24 | 44.20 | 89.5 | 99.3 | 117.01 |
| 1981-08-01 | 53.67 | 33.33 | 64.3 | 121.8 | 98.97 |
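The next cell displays data_lagged, which is not constructed anywhere in this excerpt. A minimal construction consistent with the columns shown below (lags 1–3 of every variable, with the first three incomplete rows dropped) would be:
# Hypothetical reconstruction of `data_lagged` (not shown in the original notebook)
data_lagged = data.copy()
for col in data.columns:
    for lag in (1, 2, 3):
        data_lagged[f"{col}_lag{lag}"] = data[col].shift(lag)
data_lagged = data_lagged.dropna()  # drops the first 3 rows, which lack complete lags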
data_lagged.head(20)
| QH0889 | QH0894 | PM0141 | PM0501 | PM0431 | QH0889_lag1 | QH0889_lag2 | QH0889_lag3 | QH0894_lag1 | QH0894_lag2 | QH0894_lag3 | PM0141_lag1 | PM0141_lag2 | PM0141_lag3 | PM0501_lag1 | PM0501_lag2 | PM0501_lag3 | PM0431_lag1 | PM0431_lag2 | PM0431_lag3 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FECHA | ||||||||||||||||||||
| 1980-04-01 | 110.47 | 85.36 | 156.6 | 224.3 | 138.17 | 120.46 | 54.67 | 54.52 | 37.04 | 30.41 | 15.78 | 128.8 | 120.2 | 119.3 | 161.2 | 31.4 | 245.5 | 118.66 | 92.91 | 105.94 |
| 1980-05-01 | 92.28 | 46.10 | 75.3 | 82.8 | 103.27 | 110.47 | 120.46 | 54.67 | 85.36 | 37.04 | 30.41 | 156.6 | 128.8 | 120.2 | 224.3 | 161.2 | 31.4 | 138.17 | 118.66 | 92.91 |
| 1980-06-01 | 153.65 | 63.71 | 91.0 | 178.7 | 123.51 | 92.28 | 110.47 | 120.46 | 46.10 | 85.36 | 37.04 | 75.3 | 156.6 | 128.8 | 82.8 | 224.3 | 161.2 | 103.27 | 138.17 | 118.66 |
| 1980-07-01 | 141.59 | 80.77 | 116.0 | 154.0 | 128.45 | 153.65 | 92.28 | 110.47 | 63.71 | 46.10 | 85.36 | 91.0 | 75.3 | 156.6 | 178.7 | 82.8 | 224.3 | 123.51 | 103.27 | 138.17 |
| 1980-08-01 | 65.98 | 39.99 | 69.0 | 168.3 | 97.58 | 141.59 | 153.65 | 92.28 | 80.77 | 63.71 | 46.10 | 116.0 | 91.0 | 75.3 | 154.0 | 178.7 | 82.8 | 128.45 | 123.51 | 103.27 |
| 1980-09-01 | 74.51 | 38.70 | 46.4 | 189.5 | 88.05 | 65.98 | 141.59 | 153.65 | 39.99 | 80.77 | 63.71 | 69.0 | 116.0 | 91.0 | 168.3 | 154.0 | 178.7 | 97.58 | 128.45 | 123.51 |
| 1980-10-01 | 78.25 | 68.73 | 157.9 | 345.7 | 127.23 | 74.51 | 65.98 | 141.59 | 38.70 | 39.99 | 80.77 | 46.4 | 69.0 | 116.0 | 189.5 | 168.3 | 154.0 | 88.05 | 97.58 | 128.45 |
| 1980-11-01 | 57.88 | 46.31 | 172.9 | 103.8 | 118.84 | 78.25 | 74.51 | 65.98 | 68.73 | 38.70 | 39.99 | 157.9 | 46.4 | 69.0 | 345.7 | 189.5 | 168.3 | 127.23 | 88.05 | 97.58 |
| 1980-12-01 | 56.10 | 39.16 | 111.3 | 19.9 | 99.12 | 57.88 | 78.25 | 74.51 | 46.31 | 68.73 | 38.70 | 172.9 | 157.9 | 46.4 | 103.8 | 345.7 | 189.5 | 118.84 | 127.23 | 88.05 |
| 1981-01-01 | 41.52 | 27.34 | 54.8 | 29.9 | 77.85 | 56.10 | 57.88 | 78.25 | 39.16 | 46.31 | 68.73 | 111.3 | 172.9 | 157.9 | 19.9 | 103.8 | 345.7 | 99.12 | 118.84 | 127.23 |
| 1981-02-01 | 82.59 | 52.75 | 168.0 | 141.2 | 128.54 | 41.52 | 56.10 | 57.88 | 27.34 | 39.16 | 46.31 | 54.8 | 111.3 | 172.9 | 29.9 | 19.9 | 103.8 | 77.85 | 99.12 | 118.84 |
| 1981-03-01 | 79.13 | 61.74 | 250.1 | 97.4 | 143.81 | 82.59 | 41.52 | 56.10 | 52.75 | 27.34 | 39.16 | 168.0 | 54.8 | 111.3 | 141.2 | 29.9 | 19.9 | 128.54 | 77.85 | 99.12 |
| 1981-04-01 | 91.65 | 57.94 | 145.4 | 130.0 | 133.96 | 79.13 | 82.59 | 41.52 | 61.74 | 52.75 | 27.34 | 250.1 | 168.0 | 54.8 | 97.4 | 141.2 | 29.9 | 143.81 | 128.54 | 77.85 |
| 1981-05-01 | 54.31 | 51.65 | 114.5 | 140.1 | 123.01 | 91.65 | 79.13 | 82.59 | 57.94 | 61.74 | 52.75 | 145.4 | 250.1 | 168.0 | 130.0 | 97.4 | 141.2 | 133.96 | 143.81 | 128.54 |
| 1981-06-01 | 106.14 | 48.13 | 88.3 | 100.8 | 121.69 | 54.31 | 91.65 | 79.13 | 51.65 | 57.94 | 61.74 | 114.5 | 145.4 | 250.1 | 140.1 | 130.0 | 97.4 | 123.01 | 133.96 | 143.81 |
| 1981-07-01 | 90.24 | 44.20 | 89.5 | 99.3 | 117.01 | 106.14 | 54.31 | 91.65 | 48.13 | 51.65 | 57.94 | 88.3 | 114.5 | 145.4 | 100.8 | 140.1 | 130.0 | 121.69 | 123.01 | 133.96 |
| 1981-08-01 | 53.67 | 33.33 | 64.3 | 121.8 | 98.97 | 90.24 | 106.14 | 54.31 | 44.20 | 48.13 | 51.65 | 89.5 | 88.3 | 114.5 | 99.3 | 100.8 | 140.1 | 117.01 | 121.69 | 123.01 |
| 1981-09-01 | 39.96 | 33.00 | 64.1 | 158.6 | 96.75 | 53.67 | 90.24 | 106.14 | 33.33 | 44.20 | 48.13 | 64.3 | 89.5 | 88.3 | 121.8 | 99.3 | 100.8 | 98.97 | 117.01 | 121.69 |
| 1981-10-01 | 40.81 | 32.43 | 73.8 | 83.3 | 89.12 | 39.96 | 53.67 | 90.24 | 33.00 | 33.33 | 44.20 | 64.1 | 64.3 | 89.5 | 158.6 | 121.8 | 99.3 | 96.75 | 98.97 | 117.01 |
| 1981-11-01 | 30.06 | 30.14 | 33.0 | 157.7 | 73.59 | 40.81 | 39.96 | 53.67 | 32.43 | 33.00 | 33.33 | 73.8 | 64.1 | 64.3 | 83.3 | 158.6 | 121.8 | 89.12 | 96.75 | 98.97 |
In the context of time series such as PM0431, it is worth asking whether to include the other variables at the current time step (e.g. QH0889, QH0894, PM0141, PM0501) or to work only with their lags as predictors.

When the goal is to predict a future value (e.g. PM0431_{t+1}), the recommendation is to use only the lags. This ensures the model depends solely on information available at prediction time and prevents it from relying on future values that would be unknown in production.

If the goal is to estimate PM0431 at the same instant (for example because it is costly to measure or unavailable in real time, as in missing-value imputation), then it is valid and useful to keep the current variables as predictors; they can contribute valuable complementary information about the state of the system at that moment.

👉 Conclusion: for a real-world forecasting model of PM0431, it is preferable to build the dataset with lags only. This avoids depending on data that will not be available when projecting into the future.
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
X, y = list(), list()
for i in range(len(sequences)):
# find the end of this pattern
end_ix = i + n_steps
# check if we are beyond the dataset
if end_ix > len(sequences)-1:
break
# gather input and output parts of the pattern
#seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
seq_x, seq_y = sequences[i:end_ix,:], sequences[end_ix, -1]
X.append(seq_x)
y.append(seq_y)
return np.array(X), np.array(y)
dataset = np.hstack([data[[col]] for col in data.columns])
dataset.shape
(468, 5)
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
(465, 3, 5) (465,)
X[1], y[1]
(array([[ 54.67, 30.41, 120.2 , 31.4 , 92.91],
[120.46, 37.04, 128.8 , 161.2 , 118.66],
[110.47, 85.36, 156.6 , 224.3 , 138.17]]),
np.float64(103.27))
data.head()
| QH0889 | QH0894 | PM0141 | PM0501 | PM0431 | |
|---|---|---|---|---|---|
| FECHA | |||||
| 1980-01-01 | 54.52 | 15.78 | 119.3 | 245.5 | 105.94 |
| 1980-02-01 | 54.67 | 30.41 | 120.2 | 31.4 | 92.91 |
| 1980-03-01 | 120.46 | 37.04 | 128.8 | 161.2 | 118.66 |
| 1980-04-01 | 110.47 | 85.36 | 156.6 | 224.3 | 138.17 |
| 1980-05-01 | 92.28 | 46.10 | 75.3 | 82.8 | 103.27 |
The structure shown here:
(array([
[ 54.67, 30.41, 120.2 , 31.4 , 92.91], # t-3
[120.46, 37.04, 128.8 , 161.2 , 118.66], # t-2
[110.47, 85.36, 156.6 , 224.3 , 138.17] # t-1
]),
np.float64(103.27)) # PM0431_t (target)
represents a multivariate example with the following characteristics:

✅ 5 variables at each time step: 4 predictor variables plus the target variable itself (PM0431), all as lagged values.

✅ 3 lags (time window): each row represents the state of the system at a previous time step (t−3, t−2, t−1).

✅ Expected output: the value of PM0431 at time t (103.27), which is the target the model must predict.

👉 This structure is the typical input format for multivariate time-series models: a window of 3 time steps (lags), with 5 variables per step, used to predict the future value of PM0431. It is the appropriate shape for feeding LSTMs or Transformers, since it preserves temporal order and multivariate interactions.
A time-series dataset is provided with variables measured daily at an environmental monitoring station. Each record includes:

- DATE: date of the measurement.
- PRECIPITATION(mm): total daily precipitation in millimetres.
- WINDDIR(deg): wind direction in degrees.
- WINDSPEED(m/s): wind speed in metres per second.
- PARTICULAS(microg/m3): concentration of fine particles (PM2.5) in the air, in micrograms per cubic metre.

Objective: develop a supervised learning model to predict the PM2.5 concentration (PARTICULAS(microg/m3)) from current and past meteorological conditions.

Tasks:

1. Preprocessing: convert DATE to datetime, handle missing values, and scale the variables (MinMaxScaler or StandardScaler).
2. Feature construction: create lagged versions of PRECIPITATION, WINDDIR and WINDSPEED (for example, lags of 1 to 3 days).
3. Modeling: train a model that predicts PM2.5 from the selected variables and evaluate it with RMSE, MAE and R².
4. Visualization: plot the predictions against the actual values over time.

Extension: apply a Transformer or LSTM model to explore whether explicit temporal memory improves predictive performance over an MLP with fixed lags. A starter sketch for the feature-construction step follows.
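A minimal starting point for the lag-construction step (a sketch, not the official solution; it assumes the Belisario dataset used in the next cell and the column names listed above, and it drops rows with missing lags even though proper imputation, as in the next cell, would be preferable):
import pandas as pd

url = "https://raw.githubusercontent.com/marsgr6/EN-online/refs/heads/main/data/Belisario_missing_values.csv"
df = pd.read_csv(url, parse_dates=["DATE"], dayfirst=True, index_col="DATE")

# Build 1- to 3-day lags of the meteorological predictors
lagged = df.copy()
for col in ["PRECIPITATION(mm)", "WINDDIR(deg)", "WINDSPEED(m/s)"]:
    for lag in (1, 2, 3):
        lagged[f"{col}_lag{lag}"] = df[col].shift(lag)
lagged = lagged.dropna()

X = lagged.drop(columns=["PARTICULAS(microg/m3)"])  # predictors (current + lagged weather)
y = lagged["PARTICULAS(microg/m3)"]                 # target: PM2.5 concentration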
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.impute import KNNImputer
import numpy as np
# Load the dataset
df = pd.read_csv("https://raw.githubusercontent.com/marsgr6/EN-online/refs/heads/main/data/Belisario_missing_values.csv",
parse_dates=["DATE"], dayfirst=True, index_col="DATE")
# Save mask of missing values before imputation
missing_mask = df.isna()
# KNN Imputation
imputer = KNNImputer(n_neighbors=2)
df_imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns, index=df.index)
# Plot each time series with red markers for imputed points
for col in df.columns:
plt.figure(figsize=(18, 3))
plt.plot(df_imputed.index, df_imputed[col], label=f"{col} (KNN imputed)", color='blue')
if missing_mask[col].any():
plt.scatter(
df_imputed.index[missing_mask[col]],
df_imputed[col][missing_mask[col]],
color='red',
label='Imputed (KNN)',
zorder=5
)
plt.title(f"Time Series with KNN Imputation: {col}")
plt.xlabel("Date")
plt.ylabel(col)
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
!pip install tensorflow
Collecting tensorflow
Downloading tensorflow-2.19.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (644.9 MB)
Successfully installed astunparse-1.6.3 flatbuffers-25.2.10 google-pasta-0.2.0 libclang-18.1.1 tensorboard-2.19.0 tensorboard-data-server-0.7.2 tensorflow-2.19.0 tensorflow-io-gcs-filesystem-0.37.1 werkzeug-3.1.3 wheel-0.45.1
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
# Load and prepare the data
url = "https://raw.githubusercontent.com/marsgr6/ml-online/main/data/BAJFINANCE_day__with_indicators_.csv"
df = pd.read_csv(url, parse_dates=["date"])
series = df["TYPPRICE"].dropna().values
train_series = series[:-300]
test_series = series[-300:]
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train_series.reshape(-1, 1)).flatten()
test_scaled = scaler.transform(test_series.reshape(-1, 1)).flatten()
# Dataset
class TimeSeriesDataset(Dataset):
def __init__(self, data, seq_length):
self.data = torch.tensor(data, dtype=torch.float32)
self.seq_length = seq_length
def __len__(self):
return len(self.data) - self.seq_length
def __getitem__(self, idx):
return (self.data[idx:idx+self.seq_length],
self.data[idx+self.seq_length])
seq_len = 10
train_dataset = TimeSeriesDataset(train_scaled, seq_len)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
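A quick illustrative check of the dataset contract (each sample is a window of seq_len values plus the next scalar value as target):
x0, y0 = train_dataset[0]
print(len(train_dataset), x0.shape, y0.shape)  # N windows, torch.Size([10]), torch.Size([])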
# Transformer model with custom encoder layer
class TransformerTimeSeries(nn.Module):
def __init__(self, seq_length, d_model=32, nhead=2, num_layers=2):
super().__init__()
self.embedding = nn.Linear(1, d_model)
encoder_layer = nn.TransformerEncoderLayer(
d_model=d_model,
nhead=nhead,
dropout=0.01,
dim_feedforward=512,
activation='gelu',
batch_first=True
)
self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
self.fc = nn.Linear(d_model, 1)
def forward(self, x):
x = x.unsqueeze(-1) # (B, L) → (B, L, 1)
x = self.embedding(x) # (B, L, 1) → (B, L, d_model)
x = self.transformer(x) # (B, L, d_model)
        x = x[:, -1, :]       # use the last token's representation
return self.fc(x).squeeze(-1) # (B,)
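Note that, unlike the Keras model earlier, this PyTorch encoder adds no positional encoding, so self-attention sees the window as an unordered set and only the final-token readout imposes any notion of position. A minimal sinusoidal module that could be inserted after the embedding (a sketch, not part of the original notebook):
import math

class SinusoidalPositionalEncoding(nn.Module):
    """Adds fixed sine/cosine positional encodings to (B, L, d_model) inputs."""
    def __init__(self, d_model, max_len=500):
        super().__init__()
        position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                             * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))  # (1, max_len, d_model)

    def forward(self, x):
        return x + self.pe[:, :x.size(1), :]

# Possible usage inside TransformerTimeSeries: create self.pos_enc in __init__
# and apply x = self.pos_enc(x) right after self.embedding(x).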
# Model training
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TransformerTimeSeries(seq_length=seq_len).to(device)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model.train()
for epoch in range(30):
total_loss = 0
for x, y in train_loader:
x, y = x.to(device), y.to(device)
pred = model(x)
loss = loss_fn(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
total_loss += loss.item()
print(f"[Transformer128] Epoch {epoch+1}, Loss: {total_loss / len(train_loader):.4f}")
# Evaluation
def create_test_windows(data, seq_length):
return torch.stack([data[i:i+seq_length] for i in range(len(data) - seq_length)])
model.eval()
with torch.no_grad():
test_tensor = torch.tensor(test_scaled, dtype=torch.float32)
test_inputs = create_test_windows(test_tensor, seq_len).to(device)
predictions = model(test_inputs).cpu().numpy()
predictions_original = scaler.inverse_transform(predictions.reshape(-1, 1)).flatten()
# Metrics
real = test_series[seq_len:]
rmse = np.sqrt(mean_squared_error(real, predictions_original))
mae = mean_absolute_error(real, predictions_original)
mape = np.mean(np.abs((real - predictions_original) / real)) * 100
r2 = r2_score(real, predictions_original)
# Plot results
plt.figure(figsize=(12, 6))
plt.plot(real, label="Valor real", linewidth=2)
plt.plot(predictions_original, label="Predicción Transformer (d_model=32)", linestyle="--")
plt.title("Predicción de Serie Temporal con Transformer (Escalado MinMax)")
plt.xlabel("Tiempo")
plt.ylabel("TYPPRICE")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
# Show metrics
print(f"RMSE: {rmse:.2f}")
print(f"MAE: {mae:.2f}")
print(f"MAPE: {mape:.2f}%")
print(f"R²: {r2:.4f}")
[Transformer] Epoch 1, Loss: 0.0358
[Transformer] Epoch 2, Loss: 0.0008
[Transformer] Epoch 3, Loss: 0.0005
[Transformer] Epoch 4, Loss: 0.0005
[Transformer] Epoch 5, Loss: 0.0005
[Transformer] Epoch 6, Loss: 0.0005
[Transformer] Epoch 7, Loss: 0.0005
[Transformer] Epoch 8, Loss: 0.0005
[Transformer] Epoch 9, Loss: 0.0004
[Transformer] Epoch 10, Loss: 0.0004
[Transformer] Epoch 11, Loss: 0.0004
[Transformer] Epoch 12, Loss: 0.0004
[Transformer] Epoch 13, Loss: 0.0003
[Transformer] Epoch 14, Loss: 0.0004
[Transformer] Epoch 15, Loss: 0.0003
[Transformer] Epoch 16, Loss: 0.0004
[Transformer] Epoch 17, Loss: 0.0003
[Transformer] Epoch 18, Loss: 0.0003
[Transformer] Epoch 19, Loss: 0.0003
[Transformer] Epoch 20, Loss: 0.0004
[Transformer] Epoch 21, Loss: 0.0004
[Transformer] Epoch 22, Loss: 0.0003
[Transformer] Epoch 23, Loss: 0.0003
[Transformer] Epoch 24, Loss: 0.0003
[Transformer] Epoch 25, Loss: 0.0004
[Transformer] Epoch 26, Loss: 0.0003
[Transformer] Epoch 27, Loss: 0.0003
[Transformer] Epoch 28, Loss: 0.0003
[Transformer] Epoch 29, Loss: 0.0003
[Transformer] Epoch 30, Loss: 0.0003
RMSE: 159.67
MAE: 127.16
MAPE: 1.81%
R²: 0.9427
len(predictions_original)
300
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
# Load and preprocess data
url = "https://raw.githubusercontent.com/marsgr6/ml-online/main/data/BAJFINANCE_day__with_indicators_.csv"
df = pd.read_csv(url, parse_dates=["date"])
series = df["TYPPRICE"].dropna().values
# Parameters
seq_len = 10
horizon = 1
train_series = series[:-300]
test_series = series[-300:]
# Scaling
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train_series.reshape(-1, 1)).flatten()
test_scaled = scaler.transform(test_series.reshape(-1, 1)).flatten()
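# Prepend the last `seq_len` training values so the first test points have full input windows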
full_test_input = np.concatenate([train_scaled[-seq_len:], test_scaled])
# Dataset for multi-step
class MultiStepTimeSeriesDataset(Dataset):
def __init__(self, data, seq_length, horizon):
self.data = torch.tensor(data, dtype=torch.float32)
self.seq_length = seq_length
self.horizon = horizon
def __len__(self):
return len(self.data) - self.seq_length - self.horizon + 1
def __getitem__(self, idx):
x = self.data[idx:idx + self.seq_length]
y = self.data[idx + self.seq_length : idx + self.seq_length + self.horizon]
return x, y
# Prepare training
train_dataset = MultiStepTimeSeriesDataset(train_scaled, seq_len, horizon)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
# Multi-step Transformer model with custom encoder layer
class MultiStepTransformer(nn.Module):
    def __init__(self, seq_length, horizon=1, d_model=32, nhead=2, num_layers=2):
super().__init__()
self.embedding = nn.Linear(1, d_model)
encoder_layer = nn.TransformerEncoderLayer(
d_model=d_model,
nhead=nhead,
dropout=0.01,
dim_feedforward=512,
activation='gelu',
batch_first=True
)
self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
self.fc = nn.Linear(d_model, horizon)
def forward(self, x):
x = x.unsqueeze(-1)
x = self.embedding(x)
x = self.transformer(x)
x = x[:, -1, :]
return self.fc(x)
# Training
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MultiStepTransformer(seq_length=seq_len, horizon=horizon).to(device)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model.train()
for epoch in range(30):
total_loss = 0
for x, y in train_loader:
x, y = x.to(device), y.to(device)
pred = model(x)
loss = loss_fn(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
total_loss += loss.item()
print(f"[MultiStep] Epoch {epoch+1}, Loss: {total_loss / len(train_loader):.4f}")
# Evaluation
def create_multi_step_test_windows(data, seq_length, horizon):
return torch.stack([data[i:i+seq_length] for i in range(len(data) - seq_length - horizon + 1)])
test_tensor = torch.tensor(full_test_input, dtype=torch.float32)
test_inputs = create_multi_step_test_windows(test_tensor, seq_len, horizon).to(device)
model.eval()
with torch.no_grad():
preds = model(test_inputs).cpu().numpy()
# Flatten predictions and compare
predictions_flat = preds.flatten()
true_flat = test_series[:len(predictions_flat)]
predictions_original = scaler.inverse_transform(predictions_flat.reshape(-1, 1)).flatten()
# Metrics
rmse = np.sqrt(mean_squared_error(true_flat, predictions_original))
mae = mean_absolute_error(true_flat, predictions_original)
mape = np.mean(np.abs((true_flat - predictions_original) / true_flat)) * 100
r2 = r2_score(true_flat, predictions_original)
# Plot
plt.figure(figsize=(12, 6))
plt.plot(true_flat, label="Valor real", linewidth=2)
plt.plot(predictions_original, label="Predicción (multi-step)", linestyle="--")
plt.title("Predicción de Serie Temporal Multi-Step con Transformer")
plt.xlabel("Tiempo")
plt.ylabel("TYPPRICE")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
# Show metrics
print(f"RMSE: {rmse:.2f}")
print(f"MAE: {mae:.2f}")
print(f"MAPE: {mape:.2f}%")
print(f"R²: {r2:.4f}")
[MultiStep] Epoch 1, Loss: 0.1855
[MultiStep] Epoch 2, Loss: 0.0057
[MultiStep] Epoch 3, Loss: 0.0039
[MultiStep] Epoch 4, Loss: 0.0033
[MultiStep] Epoch 5, Loss: 0.0026
[MultiStep] Epoch 6, Loss: 0.0022
[MultiStep] Epoch 7, Loss: 0.0021
[MultiStep] Epoch 8, Loss: 0.0020
[MultiStep] Epoch 9, Loss: 0.0015
[MultiStep] Epoch 10, Loss: 0.0014
[MultiStep] Epoch 11, Loss: 0.0013
[MultiStep] Epoch 12, Loss: 0.0011
[MultiStep] Epoch 13, Loss: 0.0012
[MultiStep] Epoch 14, Loss: 0.0011
[MultiStep] Epoch 15, Loss: 0.0009
[MultiStep] Epoch 16, Loss: 0.0010
[MultiStep] Epoch 17, Loss: 0.0009
[MultiStep] Epoch 18, Loss: 0.0009
[MultiStep] Epoch 19, Loss: 0.0008
[MultiStep] Epoch 20, Loss: 0.0007
[MultiStep] Epoch 21, Loss: 0.0007
[MultiStep] Epoch 22, Loss: 0.0006
[MultiStep] Epoch 23, Loss: 0.0008
[MultiStep] Epoch 24, Loss: 0.0006
[MultiStep] Epoch 25, Loss: 0.0005
[MultiStep] Epoch 26, Loss: 0.0007
[MultiStep] Epoch 27, Loss: 0.0008
[MultiStep] Epoch 28, Loss: 0.0007
[MultiStep] Epoch 29, Loss: 0.0006
[MultiStep] Epoch 30, Loss: 0.0004
RMSE: 152.16
MAE: 114.18
MAPE: 1.70%
R²: 0.9478
# Full multivariate Transformer model
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
# Load multivariate dataset
url = "https://raw.githubusercontent.com/marsgr6/ml-online/main/data/BAJFINANCE_day__with_indicators_.csv"
df = pd.read_csv(url, parse_dates=["date"])
df = df.dropna()
# Features and target
features = ['TYPPRICE', 'sma5', 'ema5', 'RSI14', 'macd510', 'upperband', 'lowerband']
data = df[features].values
# Train/test split
seq_len = 10
train_data = data[:-300]
test_data = data[-300:]
# Normalize all variables jointly
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train_data)
test_scaled = scaler.transform(test_data)
# Combine last `seq_len` from training with test for prediction
full_test_input = np.concatenate([train_scaled[-seq_len:], test_scaled])
# Dataset for multivariate input
class MultivariateTimeSeriesDataset(Dataset):
def __init__(self, data, seq_length):
self.data = torch.tensor(data, dtype=torch.float32)
self.seq_length = seq_length
def __len__(self):
return len(self.data) - self.seq_length
def __getitem__(self, idx):
x = self.data[idx:idx + self.seq_length, :]
y = self.data[idx + self.seq_length, 0] # predict TYPPRICE only
return x, y
# Prepare training set
train_dataset = MultivariateTimeSeriesDataset(train_scaled, seq_len)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
# Transformer model for multivariate input
class MultivariateTransformer(nn.Module):
def __init__(self, input_dim, d_model=128, nhead=4, num_layers=2):
super().__init__()
self.embedding = nn.Linear(input_dim, d_model)
encoder_layer = nn.TransformerEncoderLayer(
d_model=d_model, nhead=nhead, dropout=0.1,
dim_feedforward=256, activation='gelu', batch_first=True
)
self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
self.fc = nn.Linear(d_model, 1)
def forward(self, x):
x = self.embedding(x) # (B, L, input_dim) → (B, L, d_model)
x = self.transformer(x) # (B, L, d_model)
        x = x[:, -1, :]          # use the last token's representation
return self.fc(x).squeeze(-1)
# Train model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MultivariateTransformer(input_dim=len(features)).to(device)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model.train()
for epoch in range(10):
total_loss = 0
for x, y in train_loader:
x, y = x.to(device), y.to(device)
pred = model(x)
loss = loss_fn(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
total_loss += loss.item()
print(f"[Multivariate] Epoch {epoch+1}, Loss: {total_loss / len(train_loader):.4f}")
# Create test windows
def create_multivariate_windows(data, seq_length):
return torch.stack([torch.tensor(data[i:i+seq_length], dtype=torch.float32)
for i in range(len(data) - seq_length)])
test_inputs = create_multivariate_windows(full_test_input, seq_len).to(device)
# Make predictions
model.eval()
with torch.no_grad():
predictions = model(test_inputs).cpu().numpy()
predictions_original = scaler.inverse_transform(
np.hstack([predictions.reshape(-1, 1), np.zeros((len(predictions), len(features) - 1))])
)[:, 0]
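The zero-padding trick above inverse-transforms only the first column (TYPPRICE). Because MinMaxScaler scales each column independently, an equivalent and arguably cleaner alternative (a sketch reusing the variables defined above) is to fit a dedicated scaler on the target column:
# Alternative: dedicated scaler fitted on the target column only (column 0 = TYPPRICE)
target_scaler = MinMaxScaler().fit(train_data[:, [0]])
predictions_original = target_scaler.inverse_transform(predictions.reshape(-1, 1)).flatten()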
# Align real values with predictions
# Since we create (len(full_test_input) - seq_len) predictions
# This corresponds to len(test_data) + seq_len - seq_len = len(test_data)
real = test_data[:, 0]
real = real[-len(predictions_original):]
# Metrics
rmse = np.sqrt(mean_squared_error(real, predictions_original))
mae = mean_absolute_error(real, predictions_original)
mape = np.mean(np.abs((real - predictions_original) / real)) * 100
r2 = r2_score(real, predictions_original)
# Plot
plt.figure(figsize=(12, 6))
plt.plot(real, label="Valor real", linewidth=2)
plt.plot(predictions_original, label="Predicción multivariada Transformer", linestyle="--")
plt.title("Predicción de Serie Temporal Multivariada con Transformer")
plt.xlabel("Tiempo")
plt.ylabel("TYPPRICE")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
# Show metrics
print(f"RMSE: {rmse:.2f}")
print(f"MAE: {mae:.2f}")
print(f"MAPE: {mape:.2f}%")
print(f"R²: {r2:.4f}")
[Multivariate] Epoch 1, Loss: 0.1398
[Multivariate] Epoch 2, Loss: 0.0049
[Multivariate] Epoch 3, Loss: 0.0032
[Multivariate] Epoch 4, Loss: 0.0027
[Multivariate] Epoch 5, Loss: 0.0021
[Multivariate] Epoch 6, Loss: 0.0018
[Multivariate] Epoch 7, Loss: 0.0017
[Multivariate] Epoch 8, Loss: 0.0014
[Multivariate] Epoch 9, Loss: 0.0013
[Multivariate] Epoch 10, Loss: 0.0011
RMSE: 525.04
MAE: 483.16
MAPE: 6.77%
R²: 0.3788
The figure compares the true TYPPRICE values (blue line) with the prediction generated by the multivariate Transformer (dashed orange line).

The model follows the general shape of the series: it captures the global trend and some of the main fluctuations. However, the predictions appear systematically offset from the true level, which explains the weak metrics (R² = 0.38, MAPE = 6.77%) despite the broadly correct shape. Possible improvements:

1️⃣ Post-training adjustment (simple calibration): shift the predictions by the mean bias, i.e. pred + (mean_real − mean_pred); see the sketch after this list.

2️⃣ Better training: more epochs, learning-rate scheduling, or greater capacity (larger d_model, more attention heads).

3️⃣ Feature engineering.

4️⃣ Regularization and data augmentation.
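As a quick illustration of suggestion 1️⃣, a mean-shift calibration can be applied with the variables from the evaluation cell above (a sketch; it only helps when the error is dominated by a constant offset and cannot fix shape errors):
# Simple post-hoc calibration: remove the mean bias of the predictions
bias = real.mean() - predictions_original.mean()
pred_calibrated = predictions_original + bias

rmse_cal = np.sqrt(mean_squared_error(real, pred_calibrated))
print(f"RMSE before: {rmse:.2f} | after mean-shift calibration: {rmse_cal:.2f}")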